Document Image Matching Based on Component Blocks

Authors

  • Hanchuan Peng
  • Fuhui Long
  • Wan-Chi Siu
  • Zheru Chi
  • David Dagan Feng
Abstract

Document image matching is the key technique for document registration and retrieval. In this paper, a new matching algorithm based on the document component block list and the component block tree is proposed. Our method makes effective use of both the local information of each page block and the global information of the page layout, and it is robust to image distortion, filled-in text, and noise. The algorithm is then refined and applied to automatic data extraction from column forms. A demonstration software package has been developed.

Related resources

Document image template matching based on component block list

Document image matching is the key technique for document image registration and retrieval. In this paper, a new matching method based on document component block list (CBL) is proposed. A document image is firstly parsed into a number of component blocks that are defined as non-adherent rectangular areas of substantial document contents. Then these blocks are organized as a list, on which severa...

Document Image Recognition Based on Template Matching of Component Block Projections

Document Image Recognition (DIR), a very useful technique in office automation and digital library applications, is to find the most similar template for any input document image in a prestored template document image data set. Existing methods use both local features and global layout information. In this paper, we propose a novel algorithm based on the global matching of Component Block Proje...

Connected Component Based Word Spotting on Persian Handwritten image documents

Word spotting makes unindexed image documents searchable by locating a word or words in a document image, given a query word. This problem is challenging, mainly due to the large number of word classes with very small inter-class and substantial intra-class distances. In this paper, a segmentation-based word spotting method is presented for multi-writer Persian handwritten doc-...

A Block-Grouping Method for Image Denoising by Block Matching and 3-D Transform Filtering

Image denoising by block matching and three-dimensional transform filtering (BM3D) is a two-step, state-of-the-art algorithm that uses the redundancy of similar blocks in a noisy image to remove noise. Similar blocks, which can have some overlap, are found by a block matching method and grouped to make 3-D blocks for 3-D transform filtering. In this paper we propose a new block grouping algorithm in th...

Color scene transform between images using Rosenfeld-Kak histogram matching method

In digital color imaging, it is of interest to transform the color scene of one image to that of another. Some attempts have been made at this using, for example, the lαβ color space, principal component analysis, and recently the histogram rescaling method. In this research, a novel method is proposed based on the Rosenfeld and Kak histogram matching algorithm. It is suggested that to transform the color...

Publication date: 2000